In this example we’re going to use the built in faithful dataset which contains two columns: eruptions and waiting. We want to determine of the waiting times between Old Faithful eruptions are autocorrelated.

waiting
  [1] 79 54 74 62 85 55 88 85 51 85 54 84 78 47 83 52 62 84 52 79 51 47 78 69 74 83 55 76 78 79 73 77 66 80 74 52 48 80 59
 [40] 90 80 58 84 58 73 83 64 53 82 59 75 90 54 80 54 83 71 64 77 81 59 84 48 82 60 92 78 78 65 73 82 56 79 71 62 76 60 78
 [79] 76 83 75 82 70 65 73 88 76 80 48 86 60 90 50 78 63 72 84 75 51 82 62 88 49 83 81 47 84 52 86 81 75 59 89 79 59 81 50
[118] 85 59 87 53 69 77 56 88 81 45 82 55 90 45 83 56 89 46 82 51 86 53 79 81 60 82 77 76 59 80 49 96 53 77 77 65 81 71 70
[157] 81 93 53 89 45 86 58 78 66 76 63 88 52 93 49 57 77 68 81 81 73 50 85 74 55 77 83 83 51 78 84 46 83 55 81 57 76 84 77
[196] 81 87 77 51 78 60 82 91 53 78 46 77 84 49 83 71 80 49 75 64 76 53 94 55 76 50 82 54 75 78 79 78 78 70 79 70 54 86 50
[235] 90 54 54 77 79 64 75 47 86 63 85 82 57 82 67 74 54 83 73 73 88 80 71 83 56 79 78 84 58 83 43 60 75 81 46 90 46 74

\[ H_0: \text{there is no correlation} \\ H_a: \text{there is a correlation} \]

First, let’s look at the data for Old Faithful waiting times.

Now, to estimate autocorrelation, we can do the following:

(observed_cor <- cor(waiting[-length(waiting)], waiting[-1]))
[1] -0.5397898

Now, in order to determine if this value is significant, we need some way to determine a p value. Once again, we can rely on permutation testing to build a distribution of values under the null hypothesis and then compare our observed value to that distribution.

new_waiting <- sample(waiting)
cor(new_waiting[-length(new_waiting)], new_waiting[-1])
[1] 0.06117943
plot(new_waiting[1:50], type = "b")

head(results)
[1] -0.0464003824 -0.0494172073 -0.0320944716 -0.0029324479 -0.0005203281  0.0137261667

Now that we have results, which contains 10000 autocorrelation values under the null hypothesis, we can examine its distribution.

Now, let’s look at our observed_cor value in light of these results.

plot(density(results), xlim = c(observed_cor, max(results)))
abline(v = observed_cor, col = "red")

Given our observed_cor and the values contained in results, we can caluclate a p value and associated confidence interval as follows:

(p_value <- mean(abs(results) >= abs(observed_cor)))
[1] 0
ci <- p_value + c(-1, 1) * qnorm(.975) * sqrt(p_value * (1 - p_value) / n_permutations)
c(lower = ci[1],
  p_value = p_value,
  upper = ci[2])
  lower p_value   upper 
      0       0       0 
LS0tCnRpdGxlOiAiT2xkIEZhaXRoZnVsIC0gVGltZSBTZXJpZXMgQXV0b2NvcnJlbGF0aW9uIgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKLS0tCgpgYGB7ciBzZXR1cCwgaW5jbHVkZSA9IEZBTFNFfQprbml0cjo6b3B0c19jaHVuayRzZXQoY2FjaGUgPSBUUlVFKQpgYGAKCkluIHRoaXMgZXhhbXBsZSB3ZSdyZSBnb2luZyB0byB1c2UgdGhlIGJ1aWx0IGluIGBmYWl0aGZ1bGAgZGF0YXNldCB3aGljaCBjb250YWlucwp0d28gY29sdW1uczogYGVydXB0aW9uc2AgYW5kIGB3YWl0aW5nYC4gV2Ugd2FudCB0byBkZXRlcm1pbmUgb2YgdGhlIHdhaXRpbmcgdGltZXMKYmV0d2VlbiBPbGQgRmFpdGhmdWwgZXJ1cHRpb25zIGFyZSBhdXRvY29ycmVsYXRlZC4KCmBgYHtyfQp3YWl0aW5nIDwtIGZhaXRoZnVsJHdhaXRpbmcKYGBgCgoKJCQKSF8wOiBcdGV4dHt0aGVyZSBpcyBubyBjb3JyZWxhdGlvbn0gXFwKSF9hOiBcdGV4dHt0aGVyZSBpcyBhIGNvcnJlbGF0aW9ufQokJAoKRmlyc3QsIGxldCdzIGxvb2sgYXQgdGhlIGRhdGEgZm9yIE9sZCBGYWl0aGZ1bCB3YWl0aW5nIHRpbWVzLgoKYGBge3J9CndhaXRpbmcgPC0gZmFpdGhmdWwkd2FpdGluZwpwbG90KHdhaXRpbmdbMTA6NzVdLCB0eXBlID0gImIiKQpgYGAKCk5vdywgdG8gZXN0aW1hdGUgYXV0b2NvcnJlbGF0aW9uLCB3ZSBjYW4gZG8gdGhlIGZvbGxvd2luZzoKCmBgYHtyfQpkYXRhLmZyYW1lKHdhaXRpbmdbLWxlbmd0aCh3YWl0aW5nKV0sIHdhaXRpbmdbLTFdKQoKd2FpdGluZ1stMV0Kd2FpdGluZ1stbGVuZ3RoKHdhaXRpbmcpXQpgYGAKCgoKCgoKCgpgYGB7cn0KKG9ic2VydmVkX2NvciA8LSBjb3Iod2FpdGluZ1stbGVuZ3RoKHdhaXRpbmcpXSwgd2FpdGluZ1stMV0pKQpgYGAKCk5vdywgaW4gb3JkZXIgdG8gZGV0ZXJtaW5lIGlmIHRoaXMgdmFsdWUgaXMgc2lnbmlmaWNhbnQsIHdlIG5lZWQgc29tZSB3YXkgdG8gCmRldGVybWluZSBhIHAgdmFsdWUuIE9uY2UgYWdhaW4sIHdlIGNhbiByZWx5IG9uIHBlcm11dGF0aW9uIHRlc3RpbmcgdG8gYnVpbGQgYSAKZGlzdHJpYnV0aW9uIG9mIHZhbHVlcyB1bmRlciB0aGUgbnVsbCBoeXBvdGhlc2lzIGFuZCB0aGVuIGNvbXBhcmUgb3VyIG9ic2VydmVkCnZhbHVlIHRvIHRoYXQgZGlzdHJpYnV0aW9uLgoKYGBge3J9Cm5ld193YWl0aW5nIDwtIHNhbXBsZSh3YWl0aW5nKQpjb3IobmV3X3dhaXRpbmdbLWxlbmd0aChuZXdfd2FpdGluZyldLCBuZXdfd2FpdGluZ1stMV0pCgpwbG90KG5ld193YWl0aW5nWzE6NTBdLCB0eXBlID0gImIiKQpgYGAKCgpgYGB7cn0Kbl9wZXJtdXRhdGlvbnMgPC0gMTAwMDAKcmVzdWx0cyA8LSByZXBsaWNhdGUobl9wZXJtdXRhdGlvbnMsIHsKICBuZXdfd2FpdGluZyA8LSBzYW1wbGUod2FpdGluZykKICBjb3IobmV3X3dhaXRpbmdbLWxlbmd0aChuZXdfd2FpdGluZyldLCBuZXdfd2FpdGluZ1stMV0pCn0pCgpoZWFkKHJlc3VsdHMpCmBgYAoKTm93IHRoYXQgd2UgaGF2ZSBgcmVzdWx0c2AsIHdoaWNoIGNvbnRhaW5zIDEwMDAwIGF1dG9jb3JyZWxhdGlvbiB2YWx1ZXMgdW5kZXIgdGhlCm51bGwgaHlwb3RoZXNpcywgd2UgY2FuIGV4YW1pbmUgaXRzIGRpc3RyaWJ1dGlvbi4KCmBgYHtyfQpwbG90KGRlbnNpdHkocmVzdWx0cykpCmBgYAoKTm93LCBsZXQncyBsb29rIGF0IG91ciBgb2JzZXJ2ZWRfY29yYCB2YWx1ZSBpbiBsaWdodCBvZiB0aGVzZSByZXN1bHRzLgoKYGBge3J9CnBsb3QoZGVuc2l0eShyZXN1bHRzKSwgeGxpbSA9IGMob2JzZXJ2ZWRfY29yLCBtYXgocmVzdWx0cykpKQphYmxpbmUodiA9IG9ic2VydmVkX2NvciwgY29sID0gInJlZCIpCmBgYAoKR2l2ZW4gb3VyIGBvYnNlcnZlZF9jb3JgIGFuZCB0aGUgdmFsdWVzIGNvbnRhaW5lZCBpbiBgcmVzdWx0c2AsIHdlIGNhbiBjYWx1Y2xhdGUKYSBwIHZhbHVlIGFuZCBhc3NvY2lhdGVkIGNvbmZpZGVuY2UgaW50ZXJ2YWwgYXMgZm9sbG93czoKYGBge3J9CihwX3ZhbHVlIDwtIG1lYW4oYWJzKHJlc3VsdHMpID49IGFicyhvYnNlcnZlZF9jb3IpKSkKYGBgCgpgYGB7cn0KY2kgPC0gcF92YWx1ZSArIGMoLTEsIDEpICogcW5vcm0oLjk3NSkgKiBzcXJ0KHBfdmFsdWUgKiAoMSAtIHBfdmFsdWUpIC8gbl9wZXJtdXRhdGlvbnMpCmMobG93ZXIgPSBjaVsxXSwKICBwX3ZhbHVlID0gcF92YWx1ZSwKICB1cHBlciA9IGNpWzJdKQpgYGAKCg==